RTSI: An Index Structure for Multi-Modal Real-Time Search on Live Audio Streaming Services
Abstract
Audio streaming services (e.g., Mixlr, Ximalaya, Lizhi and Facebook Live Audio) have become increasingly popular due to the wide use of smartphones. More and more people listen to live audio broadcasts while doing other activities, and the data volume of live audio streams keeps growing. Searching and indexing these audio streams remains an important open problem, with the following challenges: (i) queries over a large number of audio streams need to be answered in real time; (ii) a live audio stream must be inserted into the index continuously so that it can appear in query results, and the large number of insertions often becomes a performance issue. Existing studies on audio search either oversimplify the problem or ignore live audio streams altogether. Moreover, existing studies do not explore the multi-modal property of audio streams. In this application paper, we propose a multi-modal, unified index based on the log-structured merge-tree (LSM-tree, which here consists of multiple inverted indices) to support intensive insertions and real-time search in live audio streaming applications. Our index natively supports two major types of indexing techniques in audio search: text-based indexing and sound-based indexing. The key technical challenges we need to address are that (i) a live audio stream may appear in multiple inverted indices due to its streaming nature; (ii) the relevance, popularity and freshness of each audio stream need to be maintained in a way that allows fast access, and a query often matches a large number of audio streams since audio streams usually contain many unique terms; and (iii) massive insertions happen alongside queries. To address these challenges, we propose an index (called RTSI) which avoids traversing multiple inverted indices to compute the score of an audio stream.
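To make the LSM-tree setting concrete, the following is a minimal sketch of an LSM-style collection of inverted indices. It is not the RTSI implementation: the class `LSMInvertedIndex`, the parameter `memtable_limit` and all method names are hypothetical, and scoring is omitted. It only illustrates why a live stream, indexed piece by piece, can end up in several inverted indices, so that a naive search has to visit all of them.

```python
from collections import defaultdict


class LSMInvertedIndex:
    """Toy LSM-style set of inverted indices (all names are illustrative)."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit  # postings buffered before a flush
        self.memtable = defaultdict(set)      # term -> stream ids (mutable, in memory)
        self.memtable_size = 0
        self.segments = []                    # immutable inverted indices, newest first

    def insert(self, stream_id, terms):
        """Index newly arrived terms of a live stream."""
        for term in terms:
            if stream_id not in self.memtable[term]:
                self.memtable[term].add(stream_id)
                self.memtable_size += 1
        if self.memtable_size >= self.memtable_limit:
            self._flush()

    def _flush(self):
        """Freeze the in-memory index into an immutable segment."""
        frozen = {t: frozenset(ids) for t, ids in self.memtable.items()}
        self.segments.insert(0, frozen)
        self.memtable = defaultdict(set)
        self.memtable_size = 0

    def merge_segments(self):
        """Merge all immutable segments into a single one."""
        merged = defaultdict(set)
        for seg in self.segments:
            for term, ids in seg.items():
                merged[term] |= ids
        self.segments = (
            [{t: frozenset(ids) for t, ids in merged.items()}] if merged else []
        )

    def search(self, term):
        """Naive search: consult the memtable and every segment."""
        result = set(self.memtable.get(term, ()))
        for seg in self.segments:
            result |= seg.get(term, frozenset())
        return result


idx = LSMInvertedIndex(memtable_limit=2)
idx.insert("s1", ["music", "news"])  # fills the memtable and triggers a flush
idx.insert("s1", ["jazz"])           # the same live stream keeps producing terms
idx.insert("s2", ["music"])          # second flush: "s1" now lives in two segments
```

In this toy version `search` touches every segment, which is the cost the abstract says RTSI is designed to avoid when computing the score of a stream.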
In RTSI, we propose several techniques to address these technical challenges. First, we use one inverted list which contains the sorted scores of popularity, freshness and relevance, so that we can compute the top-k query results efficiently. Second, we devise an upper bound on the scores of the unchecked audio streams, so that query answering can terminate early. Third, we create mirrors of the indices that need to be merged, so that queries can still be answered in real time while the indices are being merged. We conduct extensive experiments on audio streams obtained from Ximalaya. The experimental results show that RTSI can answer a large number of queries in real time while concurrently handling massive insertions.
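The idea of stopping early once an upper bound rules out all unchecked streams can be sketched with a classic threshold-style top-k scan. This is a stand-in technique, not the bound derived in the paper: the function `top_k`, the weighted-sum scoring and the example values are assumptions made for illustration, and the weights are assumed non-negative so that the bound is valid.

```python
import heapq


def top_k(streams, k, weights=(1.0, 1.0, 1.0)):
    """Return (ranked stream ids, number of streams scored).

    streams maps stream_id -> (popularity, freshness, relevance).
    The score is an assumed weighted sum with non-negative weights.
    """
    if k <= 0 or not streams:
        return [], 0
    # One list per score component, sorted in descending order.
    lists = [
        sorted(streams, key=lambda s, i=i: streams[s][i], reverse=True)
        for i in range(3)
    ]

    def score(sid):
        return sum(w * v for w, v in zip(weights, streams[sid]))

    seen = set()
    heap = []    # min-heap of (score, stream_id), at most k entries
    checked = 0
    for depth in range(len(streams)):
        for lst in lists:
            sid = lst[depth]
            if sid not in seen:
                seen.add(sid)
                checked += 1
                item = (score(sid), sid)
                if len(heap) < k:
                    heapq.heappush(heap, item)
                elif item > heap[0]:
                    heapq.heapreplace(heap, item)
        # Upper bound on the score of any stream not yet seen: it cannot
        # beat the current value at this depth in any of the sorted lists.
        bound = sum(
            w * streams[lst[depth]][i]
            for i, (w, lst) in enumerate(zip(weights, lists))
        )
        if len(heap) == k and heap[0][0] >= bound:
            break  # no unchecked stream can enter the top-k
    ranked = sorted(heap, reverse=True)
    return [sid for _, sid in ranked], checked


example = {
    "a": (0.9, 0.9, 0.9),
    "b": (0.8, 0.8, 0.8),
    "c": (0.1, 0.2, 0.1),
    "d": (0.2, 0.1, 0.2),
}
best, scored = top_k(example, k=2)
```

With these example values the scan scores only two of the four streams before the bound lets it stop, because the remaining streams cannot exceed the second-best score.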
Similar resources
Lot Streaming in No-wait Multi Product Flowshop Considering Sequence Dependent Setup Times and Position Based Learning Factors
This paper considers a no-wait multi-product flowshop scheduling problem with sequence-dependent setup times. Lot streaming divides the lots of products into portions called sublots in order to reduce lead times and work-in-process, and to increase machine utilization rates. The objective is to minimize the makespan. To clarify the system, a mathematical model of the problem is presented. Sin...
Using Audio, Visual, and Lexical Features in a Multi-modal Virtual Meeting Director
Multi-modal recordings of meetings provide the basis for meeting browsing and for remote meetings. However, it is often not useful to store or transmit all visual channels. In this work we show how a virtual meeting director selects one of seven possible video modes. We then present several audio, visual, and lexical features for a virtual director. In an experimental section we evaluate the fea...
An Adaptable Architecture for Mobile Streaming Applications
Many emerging mobile applications and services require playback of streaming media. In this paper we describe an adaptable system architecture to implement mobile streaming services. The main components of this architecture are the streaming server, the multicast proxy and the mobile client. The main novelty of our approach lies in the client, which is designed to fit most mobile devices. The st...
Omnidirectional View and Multi-Modal Streaming in 3D
3D Tele-immersion (3DTI) technology allows full-body, multi-modal content delivery among geographically dispersed users. In 3DTI, a user's 3D model is captured by multiple RGB-D (color plus depth) cameras surrounding the user's body. In addition, various sensors (e.g., motion sensors, medical sensors, wearable gaming consoles, etc.) specified by the application will be included to deliver a mult...
Capacitated Single Allocation P-Hub Covering Problem in Multi-modal Network Using Tabu Search
The goals of hub location problems are finding the location of hub facilities and determining the allocation of non-hub nodes to these located hubs. In this work, we discuss the multi-modal single allocation capacitated p-hub covering problem over fully interconnected hub networks. Therefore, we provide a formulation to this end. The purpose of our model is to find the location of hubs and the ...
Publication date: 2018